This README explains how to repeat our experiments and analyses.
The `core` folder contains all the source code for AdFisher, the tool we developed and used for our study.
The `test-scripts` folder contains some scripts we used for our experiments.
The `site-files` folder contains lists of websites that browsers visited in some experiments.
The `data-logs` folder contains all the logs we generated from our experiments and a Python script, `analyze.py`, that can be used to repeat the analysis.

Requirements
-----------
AdFisher runs only in UNIX environments. It depends on the standard packages listed below.
The installation commands provided work on Ubuntu and OS X. You may find it useful to install packages using `pip`.
You can install `pip` by following the instructions provided [here](http://pip.readthedocs.org/en/latest/installing.html).
To run experiments for data collection, you need the following packages:

  - selenium ```sudo pip install selenium```
  - xvfb ```sudo apt-get install xvfb```
  - xvfbwrapper ```sudo pip install xvfbwrapper```

Selenium is a web-browser automation framework. Xvfb provides a virtual display, which allows browsers to run headless.
xvfbwrapper is a Python wrapper for Xvfb.
The Xvfb package is not available on OS X, but you still have to install xvfbwrapper.

To carry out the data analysis, you need the following packages:

  - numpy, scipy, matplotlib ```sudo pip install numpy scipy matplotlib```
  - scikit-learn ```sudo pip install scikit-learn```
  - stemming ```sudo pip install stemming```
  - nltk ```sudo pip install -U pyyaml nltk```
     - You also need to download the nltk stopwords corpus by running the following commands in your Python interpreter.

```
import nltk
# Fetch only the stopwords corpus; calling nltk.download() with no
# argument opens the interactive downloader instead.
nltk.download('stopwords')
```

If `pip` fails to install numpy, scipy, or matplotlib on Ubuntu, run
```sudo apt-get install python-numpy python-scipy python-matplotlib```.
NumPy and SciPy are Python packages for scientific computing. matplotlib provides plotting functions.
scikit-learn provides a large collection of Python implementations of machine learning algorithms,
built on the NumPy, SciPy, and matplotlib packages.
We use the stemming package to stem words, and the nltk stopwords corpus to identify stopwords.
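
Before running anything, it can help to confirm that the packages above are importable. The following is a minimal sketch and not part of AdFisher: the function name `missing_packages` and the two lists are hypothetical, and the import names are assumed to match the package names listed above, except scikit-learn, which is imported as `sklearn`.

```python
import importlib

# Import names for the packages listed in this README (assumed to match
# the pip package names, except scikit-learn, which imports as "sklearn").
COLLECTION = ["selenium", "xvfbwrapper"]
ANALYSIS = ["numpy", "scipy", "matplotlib", "sklearn", "stemming", "nltk"]


def missing_packages(names):
    """Return the names in `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for label, names in (("data collection", COLLECTION), ("analysis", ANALYSIS)):
        missing = missing_packages(names)
        status = "all found" if not missing else "missing " + ", ".join(missing)
        print("%s: %s" % (label, status))
```

Run it with the same `python` interpreter you intend to use for the experiments, since packages installed for one interpreter are not visible to another. It only checks that the Python packages import; it does not check that the Xvfb binary or the nltk stopwords corpus is installed.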

Running tests
-----------
Scripts for running some experiments are provided in the `test-scripts` folder. All experiments can be performed through minor variations of these scripts. To run a script, run ```python <script>```. You can change certain experiment parameters by editing the script. Refer to the [GitHub repository](https://github.com/tadatitam/info-flow-experiments) for more information on running experiments.

Analyzing logs
-----------

All the logs generated from the experiments are provided in the `data-logs` folder.
To perform an analysis, run the `analyze.py` script followed by the type of analysis and the path to the log file.
For example, to perform the machine-learning-based analysis on the log from the gender-discrimination experiment (May), run ```python analyze.py ml logs/nondiscrimination/log.genjobs.may.txt```.
To perform the keyword-based analysis on the log from the dating experiment testing ad choice, run ```python analyze.py kw logs/ad-choice/log.dating.txt```.
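
If you want to repeat several analyses in one go, the two invocations above can be scripted. This is a rough sketch and not part of the repository: `build_command` and `run_all` are hypothetical helpers, and the only assumption about `analyze.py` is the `<type> <log_file>` argument order shown in the examples above. It should be run from the folder that contains `analyze.py`.

```python
import subprocess
import sys


def build_command(analysis, log_file, script="analyze.py"):
    """Return the argument list for one analyze.py invocation."""
    # sys.executable reuses the interpreter that is running this sketch.
    return [sys.executable, script, analysis, log_file]


def run_all(jobs, dry_run=True):
    """Print each (analysis, log_file) command, or run it if dry_run is False."""
    for analysis, log_file in jobs:
        cmd = build_command(analysis, log_file)
        if dry_run:
            print(" ".join(cmd))
        else:
            # Raises CalledProcessError if analyze.py exits with an error.
            subprocess.check_call(cmd)


if __name__ == "__main__":
    # The two example jobs from this README; dry run only prints them.
    run_all([
        ("ml", "logs/nondiscrimination/log.genjobs.may.txt"),
        ("kw", "logs/ad-choice/log.dating.txt"),
    ])
```

By default the sketch only prints the commands. Pass `dry_run=False` to `run_all` to execute them.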